Content On This Page

  • Percentiles: Definition and Calculation
  • Quartiles: Definition and Calculation
  • Percentile Rank and Quartile Rank
  • Interquartile Range and Quartile Deviation (Implicit)


Percentiles and Quartiles




Percentiles: Definition and Calculation


Definition and Concept

Percentiles are positional measures used in statistics to indicate the value below which a specified percentage of observations in a dataset falls. The $k^{\text{th}}$ percentile, denoted as $P_k$, is a value such that at least $k$ percent of the observations are less than or equal to it, and at least $(100-k)$ percent of the observations are greater than or equal to it.

Percentiles divide an ordered dataset into 100 equal parts. For example, the 10th percentile separates the lowest 10% of the data from the highest 90%. The 99th percentile separates the highest 1% of the data from the lowest 99%.

Percentiles provide a way to understand the distribution of data and the relative standing of individual observations within that distribution. They are particularly useful for comparing values from different datasets (e.g., comparing a student's score on one test to their score on another test by looking at their percentile ranks).



Calculation for Ungrouped Data

To find the $k^{\text{th}}$ percentile ($P_k$) for ungrouped data (a simple list of individual observations), follow these steps:

  1. Order the Data:

    Arrange the $n$ observations in ascending order (from smallest to largest). Let the ordered observations be $x_{(1)}, x_{(2)}, \dots, x_{(n)}$, where $x_{(1)}$ is the smallest and $x_{(n)}$ is the largest.

  2. Calculate the Position:

    Calculate the position or rank ($L_k$) of the $k^{\text{th}}$ percentile value in the ordered list. There are different formulas for calculating percentile position, which can lead to slightly different results, especially for small datasets. A commonly used formula is:

    Position $L_k = \frac{k}{100}(n+1)$

    ... (1)

    Where $k$ is the desired percentile (e.g., 70 for the 70th percentile) and $n$ is the total number of observations.

    Note: Other methods exist, e.g., $L_k = \frac{k}{100}n$ or methods involving different interpolation rules (like the method used in some statistical software). The $(n+1)$ method is often used in introductory texts.

  3. Determine the Percentile Value:

    Based on whether the calculated position $L_k$ is an integer or not:

    • If $L_k$ is an Integer: The $k^{\text{th}}$ percentile $P_k$ is simply the value of the observation located at the $L_k^{\text{th}}$ position in the ordered data. $P_k = x_{(L_k)}$.
    • If $L_k$ is not an Integer: The percentile value lies between two observations. Let $L_k = I + F$, where $I$ is the integer part and $F$ is the fractional part ($0 < F < 1$). The $k^{\text{th}}$ percentile $P_k$ is found by linear interpolation between the values at position $I$ ($x_{(I)}$) and position $I+1$ ($x_{(I+1)}$):

      $P_k = x_{(I)} + F \times (x_{(I+1)} - x_{(I)})$

      ... (2)

      This formula adds a fraction of the difference between the $(I+1)^{\text{th}}$ and $I^{\text{th}}$ values to the $I^{\text{th}}$ value.
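The three steps above can be sketched in Python as follows. This is a minimal illustration of the $(n+1)$ position rule with linear interpolation, not a reference implementation; the function name `percentile_n_plus_1` is hypothetical, and clamping positions that fall outside $1 \dots n$ to the smallest or largest value is an assumption, since the steps above do not cover that case.

```python
import math

def percentile_n_plus_1(data, k):
    """k-th percentile using the (n+1) position rule with linear interpolation."""
    xs = sorted(data)                       # Step 1: order the data
    n = len(xs)
    if n == 0:
        raise ValueError("data must not be empty")
    pos = k * (n + 1) / 100                 # Step 2: position L_k (1-based)
    # Assumption: positions outside 1..n are clamped to the extreme values.
    if pos <= 1:
        return xs[0]
    if pos >= n:
        return xs[-1]
    i = math.floor(pos)                     # integer part I
    f = pos - i                             # fractional part F
    # Step 3: x_(I) + F * (x_(I+1) - x_(I)); when F == 0 this is just x_(I)
    return xs[i - 1] + f * (xs[i] - xs[i - 1])

data = [15, 20, 25, 18, 22, 12, 30, 19, 28]
print(percentile_n_plus_1(data, 70))        # 25.0
print(percentile_n_plus_1(data, 25))        # 16.5
```

Computing the position as `k * (n + 1) / 100` rather than `(k / 100) * (n + 1)` keeps integer positions exact in floating point, which matters for the "is $L_k$ an integer" check.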


Example

Example 1. Find the 70th percentile ($P_{70}$) for the dataset: 15, 20, 25, 18, 22, 12, 30, 19, 28.

Answer:

Given: Dataset: 15, 20, 25, 18, 22, 12, 30, 19, 28.

To Find: The 70th percentile ($P_{70}$).

Solution:

  1. Order the data in ascending order:

    12, 15, 18, 19, 20, 22, 25, 28, 30

  2. Count the number of observations:

    $n = 9$

  3. Calculate the position ($L_k$) for $k=70$:

    $L_{70} = \frac{70}{100}(n+1)$

    $L_{70} = \frac{70}{100}(9+1) = \frac{70}{100}(10) = 7$

    ... (i)

    The position is the 7th observation.

  4. Determine the percentile value:

    Since $L_{70} = 7$ is an integer, the 70th percentile $P_{70}$ is the value of the observation at the 7th position in the ordered list.

    Ordered list:

    | Position | 1st | 2nd | 3rd | 4th | 5th | 6th | 7th | 8th | 9th |
    |---|---|---|---|---|---|---|---|---|---|
    | Value | 12 | 15 | 18 | 19 | 20 | 22 | 25 | 28 | 30 |

    The value at the 7th position is 25.

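Therefore, $P_{70} = 25$. This means that roughly 70% of the observations in this dataset are less than or equal to 25. With only 9 observations the actual share is 7 of 9 (about 78%), which is as close to 70% as this small dataset allows.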


Calculation for Grouped Data

For data presented in a grouped frequency distribution with class intervals, individual values are not known. We estimate percentiles using a formula similar to the median formula. This involves locating the percentile class and then interpolating within that class.

  1. Calculate Cumulative Frequencies:

    Prepare a 'less than' cumulative frequency (cf) column for the distribution. Find the total frequency $N = \sum f_i$.

  2. Find the Percentile Position:

    Calculate the position of the $k^{\text{th}}$ percentile in the cumulative frequency distribution using the formula $\frac{kN}{100}$.

  3. Locate the Percentile Class:

    Find the class interval whose 'less than' cumulative frequency is just greater than or equal to the value $\frac{kN}{100}$. This class interval is called the **percentile class** for $P_k$. The $k^{\text{th}}$ percentile value lies within this class.

  4. Determine Values for the Formula:

    From the percentile class and the cumulative frequency table, identify the following values:

    • $l$: The lower class boundary of the percentile class.
    • $N$: The total frequency.
    • $cf$: The cumulative frequency of the class immediately preceding the percentile class.
    • $f$: The frequency of the percentile class itself.
    • $h$: The class width (size) of the percentile class (assuming equal widths).
  5. Apply the Percentile Formula:

    The formula for calculating the $k^{\text{th}}$ percentile ($P_k$) for grouped data is:

    $P_k = l + \left( \frac{\frac{kN}{100} - cf}{f} \right) \times h$

    ... (3)

This formula is a generalization of the median formula for grouped data, where the median is $P_{50}$ (so $k=50$).
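As a rough sketch of how formula (3) might be applied in code, the function below locates the percentile class and interpolates within it. The function name `grouped_percentile`, its parameters, and the frequency table used in the demonstration are all hypothetical and chosen only for illustration; equal class widths are assumed, as in the steps above.

```python
def grouped_percentile(lower_bounds, width, freqs, k):
    """Estimate P_k from a grouped frequency distribution (equal class widths)."""
    if not 0 < k <= 100:
        raise ValueError("k must satisfy 0 < k <= 100")
    total = sum(freqs)                      # N
    if total <= 0:
        raise ValueError("total frequency must be positive")
    target = k * total / 100                # kN/100
    cf_before = 0                           # cf of the classes before the current one
    for l, f in zip(lower_bounds, freqs):
        if cf_before + f >= target:         # first class whose cf >= kN/100
            # P_k = l + ((kN/100 - cf) / f) * h
            return l + (target - cf_before) / f * width
        cf_before += f
    raise ValueError("lower_bounds and freqs must have the same length")

# Hypothetical distribution: classes 0-10, 10-20, 20-30, 30-40, 40-50
lower_bounds = [0, 10, 20, 30, 40]
freqs = [5, 8, 12, 10, 5]                   # N = 40

print(grouped_percentile(lower_bounds, 10, freqs, 25))   # 16.25
print(grouped_percentile(lower_bounds, 10, freqs, 75))   # 35.0
```

For $k = 50$ the same call gives the grouped median: here $\frac{kN}{100} = 20$, the percentile class is 20-30 with $cf = 13$ and $f = 12$, so $P_{50} = 20 + \frac{7}{12} \times 10 \approx 25.83$.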



Quartiles: Definition and Calculation


Definition and Relationship to Percentiles

Quartiles are specific percentiles that are widely used to divide an ordered dataset into four equal parts. They are particularly useful for understanding the spread and distribution of the data, especially the central portion.

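There are three main quartiles:

  • First quartile ($Q_1$): the 25th percentile ($P_{25}$); about 25% of the observations lie at or below it.
  • Second quartile ($Q_2$): the 50th percentile ($P_{50}$), which is the median.
  • Third quartile ($Q_3$): the 75th percentile ($P_{75}$); about 75% of the observations lie at or below it.

[Figure: Ordered data divided by quartiles into four equal parts]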

The distance between the first and third quartiles ($Q_3 - Q_1$) is called the Interquartile Range (IQR), which is a measure of dispersion for the central 50% of the data.


Calculation for Ungrouped Data

Calculating quartiles for ungrouped data (a simple list of observations) follows the same procedure as calculating percentiles, using the corresponding percentile values ($k=25, 50, 75$).

  1. Order the Data: Arrange the $n$ observations in ascending order.
  2. Calculate Positions: Use the percentile position formula $L_k = \frac{k}{100}(n+1)$ with $k=25, 50, 75$.
    • $Q_1$ position: $L_{25} = \frac{25}{100}(n+1) = \frac{n+1}{4}$
    • $Q_2$ position: $L_{50} = \frac{50}{100}(n+1) = \frac{n+1}{2}$ (Median position)
    • $Q_3$ position: $L_{75} = \frac{75}{100}(n+1) = \frac{3(n+1)}{4}$
  3. Determine Quartile Values: Find the value at each calculated position.
    • If the position is an integer, the quartile is the value at that integer position in the ordered data.
    • If the position is not an integer, interpolate between the values at the adjacent integer positions using the method described for percentiles.

Note: As with percentiles, different conventions for calculating quartile positions and values exist. The $(n+1)$ method shown above is one common approach.
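To see how much the convention can matter, the following sketch compares two position rules using Python's standard `statistics.quantiles` function on the nine-value dataset used in this page's examples. Its `"exclusive"` method places quartile $i$ at position $\frac{i(n+1)}{4}$, which corresponds to the $(n+1)$ rule; the `"inclusive"` method is shown only as one example of an alternative convention.

```python
import statistics

data = [12, 15, 18, 19, 20, 22, 25, 28, 30]

# "exclusive" (the default): positions i*(n+1)/4, i.e. the (n+1) rule
exclusive = statistics.quantiles(data, n=4, method="exclusive")

# "inclusive": a different position rule, so Q1 and Q3 come out differently
inclusive = statistics.quantiles(data, n=4, method="inclusive")

print(exclusive)   # [16.5, 20.0, 26.5]
print(inclusive)   # [18.0, 20.0, 25.0]
```

Both methods agree on the median here, but $Q_1$ and $Q_3$ differ by 1.5 units, which is a noticeable gap for such a small dataset.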


Example

Example 1. Find the first quartile ($Q_1$) and third quartile ($Q_3$) for the dataset: 12, 15, 18, 19, 20, 22, 25, 28, 30.

Answer:

Given: Dataset: 12, 15, 18, 19, 20, 22, 25, 28, 30.

To Find: The first quartile ($Q_1$) and third quartile ($Q_3$).

Solution:

  1. The data is already in ascending order:

    12, 15, 18, 19, 20, 22, 25, 28, 30

  2. Count the number of observations:

    $n = 9$

  3. Calculate the positions for $Q_1$ and $Q_3$:

Calculate $Q_1$ (k=25):

Position $L_{25} = \frac{n+1}{4} = \frac{9+1}{4} = \frac{10}{4} = 2.5$.

$L_{25} = 2.5$

... (i)

The position is not an integer (Integer part $I=2$, Fractional part $F=0.5$). We need to interpolate between the value at the 2nd position ($x_{(2)}$) and the value at the $(2+1)=3$rd position ($x_{(3)}$).

Ordered list: 12, 15, 18, 19, 20, 22, 25, 28, 30.

$x_{(2)} = 15$, $x_{(3)} = 18$.

Using the interpolation formula $P_k = x_{(I)} + F \times (x_{(I+1)} - x_{(I)})$:

$Q_1 = x_{(2)} + 0.5 \times (x_{(3)} - x_{(2)})$

$Q_1 = 15 + 0.5 \times (18 - 15)$

$Q_1 = 15 + 0.5 \times 3$

$Q_1 = 15 + 1.5$

$Q_1 = 16.5$

... (ii)

Calculate $Q_3$ (k=75):

Position $L_{75} = \frac{3(n+1)}{4} = \frac{3(9+1)}{4} = \frac{3(10)}{4} = \frac{30}{4} = 7.5$.

$L_{75} = 7.5$

... (iii)

The position is not an integer (Integer part $I=7$, Fractional part $F=0.5$). We need to interpolate between the value at the 7th position ($x_{(7)}$) and the value at the $(7+1)=8$th position ($x_{(8)}$).

Ordered list: 12, 15, 18, 19, 20, 22, 25, 28, 30.

$x_{(7)} = 25$, $x_{(8)} = 28$.

Using the interpolation formula:

$Q_3 = x_{(7)} + 0.5 \times (x_{(8)} - x_{(7)})$

$Q_3 = 25 + 0.5 \times (28 - 25)$

$Q_3 = 25 + 0.5 \times 3$

$Q_3 = 25 + 1.5$

$Q_3 = 26.5$

... (iv)

The first quartile ($Q_1$) is 16.5 and the third quartile ($Q_3$) is 26.5.

Note on Median ($Q_2$):

Let's also calculate the median ($Q_2$) for verification. Position $L_{50} = \frac{n+1}{2} = \frac{9+1}{2} = 5$. The 5th value is 20. So, $Q_2 = 20$. The three quartiles therefore split the ordered data at roughly the 25%, 50%, and 75% marks: 2 of the 9 values lie below 16.5, 4 lie below 20, and 7 lie below 26.5.


Calculation for Grouped Data

For grouped data in a frequency distribution with class intervals, quartiles are calculated using the same formula as the $k^{\text{th}}$ percentile (Formula (3) in the percentiles section above), by substituting the appropriate value of $k$ (25 for $Q_1$, 50 for $Q_2$, and 75 for $Q_3$) and identifying the corresponding quartile class and its values ($l, cf, f, h$).

Formula for $Q_1$ (First Quartile):

First, find the $Q_1$ class: the class interval whose cumulative frequency is just greater than or equal to $\frac{N}{4}$. Then use the formula:

$Q_1 = l + \left( \frac{\frac{N}{4} - cf}{f} \right) \times h$

... (5)

Where $l$ is the lower boundary of the $Q_1$ class, $cf$ is the cumulative frequency of the class preceding the $Q_1$ class, $f$ is the frequency of the $Q_1$ class, and $h$ is the class width of the $Q_1$ class.

Formula for $Q_2$ (Second Quartile / Median):

First, find the $Q_2$ class (Median class): the class interval whose cumulative frequency is just greater than or equal to $\frac{N}{2}$. Then use the formula:

$Q_2 = \text{Median} = l + \left( \frac{\frac{N}{2} - cf}{f} \right) \times h$

... (6)

Where $l$ is the lower boundary of the $Q_2$ class, $cf$ is the cumulative frequency of the class preceding the $Q_2$ class, $f$ is the frequency of the $Q_2$ class, and $h$ is the class width of the $Q_2$ class.

Formula for $Q_3$ (Third Quartile):

First, find the $Q_3$ class: the class interval whose cumulative frequency is just greater than or equal to $\frac{3N}{4}$. Then use the formula:

$Q_3 = l + \left( \frac{\frac{3N}{4} - cf}{f} \right) \times h$

... (7)

Where $l$ is the lower boundary of the $Q_3$ class, $cf$ is the cumulative frequency of the class preceding the $Q_3$ class, $f$ is the frequency of the $Q_3$ class, and $h$ is the class width of the $Q_3$ class.

For each quartile calculation, you need to identify the correct class interval ($Q_1$ class, $Q_2$ class, or $Q_3$ class) based on the cumulative frequency position, and then use the specific $l, cf, f,$ and $h$ values associated with that particular class in the formula.




Percentile Rank and Quartile Rank


Percentile Rank

While a percentile ($P_k$) is a specific data value that divides the dataset at a certain percentage, the **Percentile Rank** of a specific value $x$ is the percentage of observations in the dataset that are less than or equal to that value $x$. It provides the relative standing of a particular observation or score within its dataset.

If an observation has a percentile rank of $PR$, it means that $PR\%$ of the observations in the dataset are less than or equal to that observation's value.

For instance, if a student's score of 75 has a percentile rank of 90, it means 90% of the students scored 75 or less. Note that some definitions calculate the percentage of values *strictly less than* $x$, while others include values equal to $x$. Using "less than or equal to" is a common convention.

Calculation for Ungrouped Data:

To calculate the percentile rank (PR) of a specific value $x$ in a dataset of $n$ ungrouped observations:

  1. Order the data in ascending order.
  2. Count the number of observations that are less than or equal to the value $x$. Let this count be $C$.
  3. Calculate the percentile rank using the formula:

    $\text{PR} = \frac{C}{n} \times 100$

    ... (1)

    Where $n$ is the total number of observations.

    Note: If using the definition of percentile rank as the percentage of values *strictly less than* $x$, the numerator would be $L$ (number of values less than $x$). Some formulas also add $0.5S$ (half the number of values equal to $x$) to the numerator: $PR = \frac{L + 0.5 S}{n} \times 100$. The first formula $PR = \frac{C}{n} \times 100$ is conceptually simpler and commonly used.
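The three conventions mentioned above can be compared side by side with a short sketch. The function name `percentile_rank` and the `convention` labels (`"le"`, `"lt"`, `"mid"`) are hypothetical names introduced here for illustration only.

```python
def percentile_rank(data, x, convention="le"):
    """Percentile rank of x under one of three common conventions."""
    n = len(data)
    if n == 0:
        raise ValueError("data must not be empty")
    below = sum(1 for v in data if v < x)    # L: values strictly less than x
    equal = sum(1 for v in data if v == x)   # S: values equal to x
    if convention == "le":                   # C / n, where C = L + S
        numerator = below + equal
    elif convention == "lt":                 # L / n
        numerator = below
    elif convention == "mid":                # (L + 0.5 S) / n
        numerator = below + 0.5 * equal
    else:
        raise ValueError("convention must be 'le', 'lt', or 'mid'")
    return numerator / n * 100

data = [12, 15, 18, 19, 20, 22, 25, 28, 30]
for name in ("le", "lt", "mid"):
    print(name, round(percentile_rank(data, 22, name), 1))
# le 66.7
# lt 55.6
# mid 61.1
```

The counting approach does not require the data to be sorted first; ordering the data, as in step 1, mainly helps when counting by hand.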


Example

Example 1. For the dataset: 12, 15, 18, 19, 20, 22, 25, 28, 30, find the percentile rank of the score 22.

Answer:

Given: Dataset: 12, 15, 18, 19, 20, 22, 25, 28, 30. Value $x = 22$.

To Find: Percentile rank of 22.

Solution:

  1. The data is already ordered: 12, 15, 18, 19, 20, 22, 25, 28, 30.
  2. Count the number of observations less than or equal to 22. These are 12, 15, 18, 19, 20, and 22. The count $C = 6$.
  3. The total number of observations is $n = 9$.
  4. Calculate the Percentile Rank using the formula $PR = \frac{C}{n} \times 100$:

    $\text{PR} = \frac{6}{9} \times 100$

    ... (i)

    $\text{PR} = \frac{2}{3} \times 100$

    (Simplifying the fraction)

    $\text{PR} \approx 0.666... \times 100$

    $\text{PR} \approx 66.7$ (rounded to one decimal place)

    ... (ii)

The percentile rank of the score 22 is approximately 66.7. This means that about 66.7% of the scores in this dataset are less than or equal to 22.

Note: If using the "strictly less than" definition (L=5) and the $L/n$ formula, $PR = (5/9) \times 100 \approx 55.6$. If using the $L+0.5S$ formula (L=5, S=1), $PR = (5+0.5 \times 1)/9 \times 100 = 5.5/9 \times 100 \approx 61.1$. This highlights how the definition of percentile rank can vary. The $C/n$ method is often the most straightforward.


Quartile Rank

The term "Quartile Rank" is not a standard statistical term like "percentile rank". It is sometimes used in a less formal sense to indicate which quarter of the ordered dataset a particular value falls into, based on the calculated quartiles ($Q_1, Q_2, Q_3$).

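Given a dataset and its calculated quartiles, a value $x$ can be assigned a 'quartile rank' based on its position relative to $Q_1, Q_2$, and $Q_3$:

  • Quartile rank 1: $x \le Q_1$ (first quarter)
  • Quartile rank 2: $Q_1 < x \le Q_2$ (second quarter)
  • Quartile rank 3: $Q_2 < x \le Q_3$ (third quarter)
  • Quartile rank 4: $x > Q_3$ (fourth quarter)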

Alternatively, "quartile rank" might simply refer to the percentile rank corresponding to a specific quartile value (e.g., the quartile rank of $Q_1$ is 25%, of $Q_2$ is 50%, and of $Q_3$ is 75%). However, this is just restating the definition of quartiles as percentiles.

To determine the 'quartile rank' of a specific value $x$ in the first sense, you need to calculate the quartiles ($Q_1, Q_2, Q_3$) for the dataset and then compare the value $x$ to these calculated quartile values.
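A minimal sketch of this comparison is shown below. It assumes the informal convention that a value at or below $Q_1$ gets rank 1, a value in $(Q_1, Q_2]$ gets rank 2, a value in $(Q_2, Q_3]$ gets rank 3, and anything above $Q_3$ gets rank 4. The function name `quartile_rank` is hypothetical, and because the term itself is not standard, other sources may place the boundaries differently.

```python
from bisect import bisect_left

def quartile_rank(x, q1, q2, q3):
    """Return 1, 2, 3, or 4 depending on which quarter x falls into.

    bisect_left counts how many quartiles are strictly less than x,
    so a value equal to a quartile stays in the lower quarter.
    """
    return bisect_left([q1, q2, q3], x) + 1

# Quartiles of 12, 15, 18, 19, 20, 22, 25, 28, 30 under the (n+1) method
q1, q2, q3 = 16.5, 20, 26.5

print(quartile_rank(19, q1, q2, q3))   # 2
print(quartile_rank(28, q1, q2, q3))   # 4
```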


Example

Example 2. For the dataset from Example 1 (12, 15, 18, 19, 20, 22, 25, 28, 30), determine the quartile rank for the score 19. From Example 1 in the quartiles section, we found $Q_1=16.5$, $Q_2=20$ (Median), and $Q_3=26.5$.

Answer:

Given: Dataset: 12, 15, 18, 19, 20, 22, 25, 28, 30. Value $x=19$. Quartiles $Q_1=16.5$, $Q_2=20$, $Q_3=26.5$.

To Determine: The quartile rank for the score 19.

Solution:

We compare the value $x=19$ with the calculated quartile values:

  • Is $19 \le Q_1$ (16.5)? No, $19 > 16.5$.
  • Is $16.5 < 19 \le Q_2$ (20)? Yes, $16.5 < 19 \le 20$.

Since the score 19 falls in the range $(Q_1, Q_2]$, which is the second quarter of the data (between the 25th and 50th percentiles), it belongs to the **second quartile range**.

Therefore, the 'quartile rank' for the score 19 is considered to be 2.

Note: This interpretation means the value falls within the second 25% of the ordered data (specifically, between the 25th and 50th percentile values). Its percentile rank gives a more precise measure of its standing: four observations (12, 15, 18, 19) are less than or equal to 19, so $\text{PR} = \frac{4}{9} \times 100 \approx 44.4$, which lies between 25 and 50 as expected.



Interquartile Range and Quartile Deviation (Implicit)


Measures of Dispersion based on Quartiles

While the Range is simple, its sensitivity to outliers is a major drawback. Measures of dispersion based on quartiles provide robust alternatives that focus on the spread of the central portion of the data, making them resistant to extreme values.

These measures are calculated using the first quartile ($Q_1$) and the third quartile ($Q_3$), which are relatively unaffected by the lowest 25% and highest 25% of the data.

1. Interquartile Range (IQR)

The **Interquartile Range (IQR)** is a measure of dispersion that represents the range covered by the middle 50% of the data. It is simply the difference between the third quartile ($Q_3$) and the first quartile ($Q_1$):

$\text{IQR} = Q_3 - Q_1$

[Figure: Diagram illustrating Interquartile Range (IQR)]

2. Quartile Deviation (QD) or Semi-Interquartile Range

The **Quartile Deviation (QD)** is another measure of dispersion derived from quartiles. It is defined as half of the Interquartile Range:

$\text{QD} = \frac{Q_3 - Q_1}{2} = \frac{\text{IQR}}{2}$
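Both measures follow directly from $Q_1$ and $Q_3$, as the short sketch below illustrates. It assumes the quartiles are computed with the $(n+1)$ position rule, which corresponds to the `"exclusive"` method of Python's standard `statistics.quantiles`; under a different quartile convention the results could differ.

```python
import statistics

data = [12, 15, 18, 19, 20, 22, 25, 28, 30]

# Quartiles via the (n+1) position rule ("exclusive" method)
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")

iqr = q3 - q1        # Interquartile Range: spread of the middle 50%
qd = iqr / 2         # Quartile Deviation (semi-interquartile range)

print(q1, q3)        # 16.5 26.5
print(iqr, qd)       # 10.0 5.0
```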


Example

Example 1. For the dataset used in Example 1 of the quartiles section (12, 15, 18, 19, 20, 22, 25, 28, 30), we found the first quartile $Q_1=16.5$ and the third quartile $Q_3=26.5$. Calculate the Interquartile Range (IQR) and the Quartile Deviation (QD).

Answer:

Given: $Q_1 = 16.5$ and $Q_3 = 26.5$ for the dataset.

To Calculate: Interquartile Range (IQR) and Quartile Deviation (QD).

Solution:

Calculate IQR:

Using the formula IQR $= Q_3 - Q_1$:

IQR $= 26.5 - 16.5$

... (i)

IQR $= 10$

... (ii)

The Interquartile Range is 10.

This means the middle 50% of the scores in this dataset span a range of 10 units.

Calculate QD:

Using the formula QD $= \frac{Q_3 - Q_1}{2}$:

QD $= \frac{26.5 - 16.5}{2}$

... (iii)

QD $= \frac{10}{2}$

QD $= 5$

... (iv)

The Quartile Deviation is 5.